Mathematical modeling and optimizing the in vitro shoot proliferation of wallflower using multilayer perceptron non-dominated sorting genetic algorithm-II (MLP-NSGAII)

Novel computational methods such as artificial neural networks (ANNs) can facilitate modeling and predicting the results of tissue culture experiments and thereby decrease the number of experimental treatments and combinations. The objective of the current study was to model and predict in vitro shoot proliferation of Erysimum cheiri (L.) Crantz, an important bedding flower and medicinal plant whose micropropagation has not been investigated before, using a multilayer perceptron linked to the non-dominated sorting genetic algorithm-II (MLP-NSGAII) as a case study. MLP was used to model three outputs, shoot number (SN), shoot length (SL), and callus weight (CW), based on four input variables: 6-benzylaminopurine (BA), kinetin (Kin), 1-naphthalene acetic acid (NAA) and gibberellic acid (GA3). Coefficients of determination (R²) of 0.84, 0.99 and 0.93 between experimental and predicted data were obtained for SN, SL, and CW, respectively, indicating the high accuracy of the MLP model. The model was then linked to NSGA-II to optimize the input variables for the best predicted outputs. Sensitivity analysis indicated that SN and CW were most sensitive to BA, followed by Kin, NAA and GA3, whereas SL was more sensitive to GA3 than to NAA. The validation experiment indicated that the difference between the validation data and the MLP-NSGAII predictions was negligible. Generally, MLP-NSGAII can be considered a powerful method for modeling and optimizing in vitro studies.


Introduction
Erysimum cheiri (L.) Crantz, commonly named wallflower, is a biennial or perennial ornamental plant belonging to the Brassicaceae family. Wallflowers are grown all over the world in a variety of colors as important bedding and garden flowers [1,2]. This species is widely used as a landscape plant, flowering pot plant, and rock garden flower [2]. It has also been extensively used in traditional medicine as a cardioactive, antifissure, anti-inflammatory, emmenagogue, fertilizer and anti-tumor agent [3]. Due to the increasing market demand for this valuable ornamental plant, conventional propagation and breeding approaches are no longer sufficient, and it is necessary to establish high-quality biotechnological methods. Plant tissue culture is an important agricultural biotechnology technique that allows the production of crops with uniform characteristics in a short time and in cost-effective systems under aseptic conditions [4]. There is no report on the micropropagation of Erysimum cheiri, so a robust and efficient protocol has yet to be developed. Consequently, the use of breeding methods and biotechnological techniques in this plant faces some limitations. Several intrinsic factors (e.g., genotype, organ type, and explant developmental age) as well as external parameters (e.g., vitamins, plant growth regulators (PGRs), carbohydrate source, temperature, and light) affect in vitro shoot growth and development [5]. PGRs play a vital role in in vitro processes such as shoot proliferation [6,7], shoot organogenesis [8], somatic embryogenesis [9], and callus induction [8,10]. Therefore, finding the optimal medium composition for achieving ideal results is a major challenge for de novo plant micropropagation [4,11].
Since traditional analytical methods such as linear regression are not suitable for non-linear biosystems [7,12], artificial neural networks (ANNs), as non-linear modeling techniques, have become a reliable method for predicting and optimizing the relationships between the inputs and outputs of a biological process [12,13]. ANNs consist of numerous highly interconnected processing neurons that work in parallel to find a solution for a specific problem [14]. ANNs learn from examples, and the examples should be chosen carefully; otherwise the model may perform inaccurately [14,15]. ANNs are able to recognize the relationship between output and input variables and to identify the knowledge inherent in the datasets without prior physical considerations. Hence, ANNs are considered a "black box" [16,17]. The multilayer perceptron (MLP) is one of the common ANNs applied for modeling and predicting in vitro culture processes [4]. MLP is inspired by the neural structure of the human brain and consists of an input layer, one or more hidden layers, and an output layer [14]. Like the human neural network, ANNs contain nodes, each of which receives a number of input variables and produces a single output that is a relatively simple function of those inputs [18]. Connections between nodes carry weights whose values are set during training so that the network outputs are as close as possible to the target values in the training data. Network fitting is conducted by means of the back-propagation algorithm, which adjusts the weights by propagating the output error backwards through the connections, from each layer to the preceding one [14,16]. Recently, a number of reports have been published on the use of artificial intelligence models in plant tissue culture procedures [4,12,16,18-23].
Evolutionary optimization algorithms are considered powerful mathematical methods for solving complex, multidimensional problems, such as determining the optimal factors for micropropagation, with high speed and accuracy [4]. There are different types of evolutionary optimization algorithms, and the genetic algorithm (GA) has been applied in the vast majority of plant tissue culture optimization studies relating to shoot proliferation [17], secondary metabolite production [23], and somatic embryogenesis [21]. GA is an optimization algorithm based on the principles of genetic variation and natural selection that evolves a population of candidates towards the best solution for a specific problem [16]. Plant tissue culture is a multi-objective system that may not be adequately optimized using GA with a single-objective function; therefore, multi-objective algorithms are necessary for the optimization of outputs [19]. Classical optimization methods, including multi-criterion decision-making methods, convert a multi-objective optimization problem into a single-objective one by emphasizing one particular Pareto-optimal solution at a time. With this approach, multiple runs are required to obtain different possible solutions [20]. One of the first evolutionary multi-objective optimization algorithms able to search the solution domain for Pareto-optimal solutions within a multi-objective scheme is the Non-dominated Sorting Genetic Algorithm-II (NSGA-II) [20]. A few reports have used ANN-NSGA-II for predicting and optimizing plant sterilization [20], shoot proliferation [19] and somatic embryogenesis [21] of Chrysanthemum × grandiflorum. Hesami et al. (2019) [20] used MLP-NSGAII to find the optimum concentrations of disinfectants and immersion times that minimize in vitro contamination frequency (CF) and maximize explant viability (EV) of chrysanthemum.
The R² (over 94%) indicated that MLP-NSGAII was a powerful model for optimizing and forecasting in vitro sterilization of chrysanthemum. They also suggested that MLP-NSGAII can be employed as a precise method in different areas of in vitro culture. They also applied ANFIS linked to NSGAII to optimize the hormonal combinations (2,4-D and BAP), carbohydrate source (sucrose, fructose, and glucose) and light quality in order to maximize the embryogenesis frequency (EF) and the number of somatic embryos (NSE) in chrysanthemum. They reported high efficiency and accuracy of ANFIS-NSGAII in modeling somatic embryogenesis (R² > 0.92) [21]. In another study [19], RBF-NSGAII was used to model and predict the optimal levels of BAP, IBA, PG and sucrose for shoot proliferation in order to maximize shoot number and shoot length and concurrently minimize the callus weight of chrysanthemum. The high R² (> 0.76) between observed and predicted values indicated that RBF-NSGAII can be considered an efficient computational strategy for modeling and optimizing in vitro organogenesis [19].
In this study, we tried to propose a model for shoot proliferation by using non-linear MLP-NSGAII modeling and optimization procedure. In this way, making a strong link between the MLP model and NSGAII was our first priority in order to find the highest efficiency and the optimum concentrations of PGRs for significant in vitro shoot proliferation. Generally, the objective of this study is to model and optimize the appropriate plant growth regulators' compositions for maximum shoot proliferation of E. cheiri.

Plant materials
Nodal segments (1.5 cm) were harvested from tetraploid wallflower plants kept in the greenhouse. After washing with tap water for 30 minutes, the explants were placed in a solution of detergent and water (1:1) and washed. Subsequent disinfection steps were performed in a laminar airflow chamber by soaking the explants in 70% ethanol for 30 seconds followed by 3% sodium hypochlorite solution for 7 minutes. The explants were washed three times with sterile distilled water and then placed on MS medium [24] containing 6% agar and 3% sucrose. The pH was adjusted to 5.8 using 1 N HCl or 1 N NaOH before autoclaving at 121 °C. The culture flasks were kept under a 16/8 h (light/dark) photoperiod for 4 weeks at a light intensity of 80 µmol m⁻² s⁻¹ and a temperature of 24 ± 2 °C.

Multilayer perceptron (MLP) model
In order to construct the MLP model, the four PGRs were used as inputs, and SN, SL, and CW were considered as outputs for the modeling of in vitro proliferation (Fig 1). The model was built with the aim of obtaining the maximum SN and SL as well as the minimum CW. Prior to modeling, the data were randomly divided into 80% training and 20% testing sets. The input and output datasets were normalized between -1 and 1 by the mapminmax transformation. Principal component analysis (PCA) was used to detect outliers; however, no outlier was identified. The inputs and outputs are presented to the network in a supervised training procedure, and training continues until the following function is minimized:

$$\mathrm{Error} = \frac{1}{K}\sum_{k=1}^{K}\left(y_k - \hat{y}_k\right)^2$$

where $K$ is the number of data, $y_k$ is the kth observed output, and $\hat{y}_k$ is the kth predicted output. In a three-layer MLP with $m$ neurons in the hidden layer and $n$ input variables, $\hat{y}$ is calculated as:

$$\hat{y} = f\left[\sum_{j=1}^{m} w_j \, g\left(\sum_{i=1}^{n} w_{ji} x_i + w_{j0}\right) + w_0\right]$$

where $w_j$ is the weight connecting the jth neuron of the hidden layer to the output neuron, $w_{ji}$ is the weight connecting the ith input variable to the jth neuron of the hidden layer, $x_i$ is the ith input variable, $w_{j0}$ is the bias of the jth neuron of the hidden layer, $w_0$ is the bias of the output neuron, $g$ is the transfer function of the hidden layer, and $f$ is the transfer function of the output layer.
In this study, a three-layer perceptron (feed-forward back-propagation network) was applied with hyperbolic tangent sigmoid (tansig) and linear (purelin) transfer functions for the hidden and output layers, respectively. Bayesian regularization was used for training the network and determining the optimal weights and biases. Since the number of hidden layers and the number of neurons in each layer play an important role in the efficiency of MLP, they should be determined carefully. Some reports give equations for estimating the optimal number of neurons in the hidden layer [4,25], but ultimately it has to be found by trial and error. Too many or too few neurons result in over-fitting or under-fitting, respectively. In the current investigation, a trial-and-error approach was used to find the optimal number of neurons in the hidden layer.
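The forward pass of such a network can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names `mapminmax` and `mlp_forward` and all weight values are hypothetical, the hidden transfer function is taken to be tanh (tansig), the output transfer function is the identity (purelin), and constant inputs are mapped to the lower bound by convention.

```python
import math

def mapminmax(values, lo=-1.0, hi=1.0):
    """Linearly rescale a list of numbers to [lo, hi].

    Assumption: a constant list is mapped to lo to avoid division by zero.
    """
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        return [lo for _ in values]
    return [lo + (hi - lo) * (v - vmin) / (vmax - vmin) for v in values]

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Three-layer MLP with a tanh hidden layer and one linear output neuron.

    x        : list of n normalized inputs (e.g. BA, Kin, NAA, GA3)
    w_hidden : m rows of n weights (w_ji)
    b_hidden : m hidden-layer biases (w_j0)
    w_out    : m output weights (w_j)
    b_out    : output bias (w_0)
    """
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    # purelin: the output is the weighted sum itself
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Hypothetical usage: 4 inputs, 2 hidden neurons, arbitrary weights
prediction = mlp_forward(
    [0.5, -0.2, 0.0, 1.0],
    w_hidden=[[0.1, 0.2, -0.3, 0.4], [0.0, -0.5, 0.2, 0.1]],
    b_hidden=[0.05, -0.1],
    w_out=[0.7, -0.4],
    b_out=0.2,
)
```

One such network would be evaluated per output (SN, SL, CW); training the weights, for example with Bayesian regularization, is outside the scope of this sketch.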

Performance measures
A separate MLP model was trained for each of the three outputs (SN, SL, and CW). The best fit for each model was determined based on the mean bias error (MBE), root mean square error (RMSE) and coefficient of determination (R²) as follows:

$$\mathrm{MBE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$

where $n$ is the number of data, $y_i$ is the ith observed value, $\hat{y}_i$ is the ith predicted value, and $\bar{y}$ is the mean of the observed values. RMSE and MBE values closer to 0 and R² values closer to 1 indicate better performance of the constructed models [4,12,15,26].
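As a small worked illustration, the three performance measures can be computed as follows. This sketch assumes the standard definitions (MBE as the mean of observed minus predicted, RMSE as the root of the mean squared difference, R² as one minus the ratio of residual to total sum of squares); the function names are illustrative.

```python
import math

def mbe(obs, pred):
    """Mean bias error: mean of (observed - predicted)."""
    return sum(o - p for o, p in zip(obs, pred)) / len(obs)

def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def r2(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot
```

Note that with this definition R² can become negative when the predictions are worse than simply using the mean of the observations, and that a positive MBE means the model under-predicts on average.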

Optimization algorithm NSGA-II
The developed MLP model was used as the fitness function of NSGA-II to determine the optimum amounts and combinations of input variables that give the best output values (Fig 2). The algorithm first generates a number of random solutions and then calculates the objective functions for each solution. The search for optimal solutions during NSGA-II implementation was limited to the lower and upper bounds of the input variables [20]. A binary tournament operator was used to select parents for crossover based on two criteria, non-domination rank and crowding distance, which reflect the two properties of a good Pareto front (convergence and diversity). A mutation operator was applied to keep the algorithm from getting stuck in a local optimum. Once the new solutions were generated, the objective function values were recalculated, and the process was repeated until one of the termination criteria was met [4]. In each generation, the non-dominated solutions in objective space constitute a Pareto front; any point on this front can be an optimal solution of the problem [4].
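The two selection criteria can be illustrated with a short sketch. It is not a full NSGA-II implementation: it only shows a dominance test, extraction of the first non-dominated front, and the crowding distance within a front. All function names are illustrative, and all objectives are assumed to be expressed as minimization, so SN and SL are negated by the hypothetical helper `to_minimization`.

```python
def to_minimization(sn, sl, cw):
    """Maximize SN and SL, minimize CW: negate the maximized objectives."""
    return (-sn, -sl, cw)

def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def crowding_distance(front):
    """Crowding distance of each point in a front (boundary points get infinity)."""
    n = len(front)
    dist = [0.0] * n
    if n == 0:
        return dist
    for k in range(len(front[0])):
        order = sorted(range(n), key=lambda i: front[i][k])
        lo, hi = front[order[0]][k], front[order[-1]][k]
        dist[order[0]] = dist[order[-1]] = float("inf")
        if hi == lo:
            continue  # all points share this objective value
        for pos in range(1, n - 1):
            i = order[pos]
            gap = front[order[pos + 1]][k] - front[order[pos - 1]][k]
            dist[i] += gap / (hi - lo)
    return dist
```

In a binary tournament, two candidates are drawn at random; the one in the better (lower) front wins, and ties within a front are broken in favor of the larger crowding distance, which keeps the front well spread.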
We considered SN, SL and CW as three objective functions to find the optimum values of the inputs based on the results of the MLP model. In this study, the initial population was set to 50, the number of generations to 800, the crossover rate to 0.8 and the mutation rate to 0.01. The ideal point on the Pareto front was chosen so that, while SN and SL were maximized and CW was minimized, the value of the following expression was minimized:

$$E = \sqrt{\left(SN - n\right)^2 + \left(SL - l\right)^2 + \left(CW - c\right)^2}$$

where $n$ and $l$ are the maximum SN and SL, respectively, and $c$ is the minimum CW in the observed data. Objective function values were scaled between 0 and 1 before applying this equation.
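The choice of a single solution from the Pareto front can be sketched as follows. The sketch assumes that E is the Euclidean distance between a candidate (SN, SL, CW) and the ideal point (n, l, c), with all values already scaled to the range 0 to 1; the function names and the example front are hypothetical.

```python
import math

def ideal_point_distance(sn, sl, cw, n, l, c):
    """Euclidean distance from (sn, sl, cw) to the ideal point (n, l, c)."""
    return math.sqrt((sn - n) ** 2 + (sl - l) ** 2 + (cw - c) ** 2)

def pick_solution(front, n, l, c):
    """Return the front member closest to the ideal point."""
    return min(front, key=lambda s: ideal_point_distance(s[0], s[1], s[2], n, l, c))

# Hypothetical scaled front of (SN, SL, CW) triples; ideal point is (1, 1, 0)
example_front = [(1.0, 0.5, 0.5), (0.9, 0.9, 0.1), (0.2, 1.0, 0.0)]
best = pick_solution(example_front, n=1.0, l=1.0, c=0.0)
```

Scaling matters here: without it, the objective with the largest numerical range (for example SN in shoots versus CW in grams) would dominate the distance.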

Sensitivity analyses
Sensitivity analysis was applied on the obtained ANN model to determine the importance of input variables in the model.
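One common way to carry out such an analysis is sketched below. It rests on assumptions that are not spelled out in this section: the variable sensitivity error (VSE) is taken to be the RMSE of the model when one input is made unavailable, the variable sensitivity ratio (VSR) is the VSE divided by the RMSE of the model with all inputs available, and "unavailable" is emulated by replacing the input with its mean. The names `rmse`, `sensitivity_ratios` and `model` are illustrative.

```python
import math

def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def sensitivity_ratios(model, X, y):
    """Return one VSR per input column; larger values mean a more important input.

    model : callable mapping one input row (list) to a prediction
    X     : list of input rows (lists of equal length)
    y     : list of observed outputs
    """
    base_error = rmse(y, [model(row) for row in X])
    if base_error == 0:
        raise ValueError("baseline error is zero; the ratio is undefined")
    ratios = []
    for j in range(len(X[0])):
        mean_j = sum(row[j] for row in X) / len(X)
        # Assumption: an unavailable input is replaced by its mean value
        masked = [row[:j] + [mean_j] + row[j + 1:] for row in X]
        vse = rmse(y, [model(row) for row in masked])
        ratios.append(vse / base_error)
    return ratios
```

Under this definition, a ratio close to 1 means that removing the input hardly changes the model error, whereas a ratio well above 1 marks an influential input; the inputs can then be ranked by their ratios for each output.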

Validation experiments
In order to evaluate the efficiency of the MLP-NSGAII, optimized hormonal combinations of shoot proliferation parameters (i.e., SN, SL, and CW) were tested experimentally.

Results
In this study, the effects of four PGRs on shoot number (SN), shoot length (SL), and callus weight (CW) of E. cheiri were studied. Among the combinations of the two cytokinins BA and Kin, 2 mg l⁻¹ BA + 2 mg l⁻¹ Kin and 2 mg l⁻¹ BA + 1 mg l⁻¹ Kin produced the highest shoot length (3.90 cm and 3.86 cm, respectively), while the highest callus weight (0.25 g) was formed in 2 mg l⁻¹ BA + 2 mg l⁻¹ Kin. The combinations of 1 mg l⁻¹ BA + 2 mg l⁻¹ Kin and 2 mg l⁻¹ BA + 1 mg l⁻¹ Kin produced the most shoots (6.06 and 5.81, respectively). No shoot or callus formation was observed on MS medium without PGRs (control treatment) (Table 1). The highest weight of callus (0.3 g) was observed in 2 mg.

ANN modeling and evaluation
MLP was used to model and predict the effect of BA, Kin, NAA, and GA3 on the shoot proliferation parameters SN, SL, and CW of E. cheiri. The R², RMSE, and MBE of the models are presented in Table 2. The high R² values and low RMSE and MBE values indicate the predictive capability of the models. The regression graphs (Figs 3-5) show this correlation: the models were efficient in predicting the outputs, and the values estimated by MLP were similar to the experimental data (Table 2).

Model optimization
The final aim of this study was to optimize the inputs of the MLP model with NSGA-II in order to find the PGR concentrations that give maximum SN and SL and minimum CW. The optimal SN (7.12), SL (3.99 cm), and CW (0.21 g) can be obtained from a medium containing 1.41 mg l⁻¹ BA, 1.17 mg l⁻¹ Kin, 0.04 mg l⁻¹ NAA and 0.14 mg l⁻¹ GA3 (Table 3).

Sensitivity analyses
The importance of each input was evaluated through the variable sensitivity ratio (VSR) obtained for every output (SN, SL, and CW) (Table 4). Based on the sensitivity analysis, shoot number and callus weight were most sensitive to BA, followed by Kin, NAA and GA3. The most important factors affecting shoot length were BA, followed by Kin, GA3 and NAA. In contrast to SN and CW, SL was therefore more sensitive to GA3 than to NAA.

Validation experiment
According to the validation experiment (Table 5), there was a negligible difference between the experimental validation data and the data predicted via MLP-NSGAII. The hormonal composition predicted via MLP-NSGAII resulted in 7.1 SN, 3.67 cm SL, and 0.19 g CW (Table 5; Fig 6A-6C).

Discussion
The reliability and applicability of machine learning as a powerful computational approach have recently been reviewed in different areas of plant science such as in vitro culture [4], plant breeding [27], stress phenotyping [28], and systems biology [29]. Moreover, the accuracy of ANNs has recently been confirmed for modeling, prediction, and optimization of different in vitro culture systems such as sterilization [20,30], seed germination [5,31], callogenesis [32,33], shoot proliferation [19,34-36], somatic embryogenesis [37,38], androgenesis [39], gene transformation [40,41], and secondary metabolite production [42,43]. There are many approaches for optimizing the culture medium in plant tissue culture, but there is no universal protocol that can be used to modify a micropropagation medium for a large number of plants. Optimizing in vitro micropropagation, a multivariable and complex system, is a highly tedious, expensive, and time-consuming process, and traditional statistical methods such as regression models alone are not reliable for approximating these nonlinear relationships [4]. Therefore, there is a serious need for new computational approaches such as ANNs to analyze and optimize this type of system more efficiently using fewer treatments [14,20]. The MLP model, one of the most popular types of ANN, is employed in many micropropagation studies. It contains three main parts (one input layer, one or more hidden layers and one output layer) and can be successfully employed for prediction, classification, signal processing and error filtering [14,18]. Designing and training an ANN involve several problems. One of the most important is assigning the weights in the ANN structure, which has a direct effect on model performance [14].
The genetic algorithm is applied to find the optimal point of complex nonlinear functions. Integrating it with an artificial neural network has several advantages, such as increasing the accuracy of the ANN by updating the weight and bias values [14,18]. Therefore, the hybridization of ANNs and multi-objective optimization algorithms can be considered an accurate and reliable methodology for predicting and optimizing in vitro culture [5]. A high coefficient of determination between observed and predicted values for both the training and testing processes indicates good performance of the models for the studied parameters [19]. The high efficiency of MLP in plant tissue culture has been shown by several studies. For instance, Arab et al. [22] employed MLP-GA for modeling and predicting the optimal hormonal combinations for proliferation of the G × N15 vegetative rootstock. They reported high accuracy of the MLP-GA models (R² > 0.81).

PLOS ONE
Zhang et al. [44] used MLP for modeling and predicting organogenic callus production of melon and reported that MLP was able to accurately model and predict the system (R² > 0.96). In another study, MLP was employed to model and predict in vitro root formation in grapevine [45]. The authors reported high accuracy of the MLP-GA model (R² > 0.78). However, GA as a single-objective algorithm cannot optimize the multiple objective functions of tissue culture problems simultaneously [19]. Therefore, multi-objective evolutionary algorithms (MOEAs) are required to optimize the outputs [20]. The main benefit of MOEAs is that they generate reasonably good approximations of the non-dominated frontier during a single run and in limited computational time [46,47]. NSGAII, as a multi-objective evolutionary algorithm, generates a group of non-dominated solutions (identified as Pareto-optimal solutions) to find a trade-off between the different objective functions, improving each objective function without worsening the others in order to guide the population towards the Pareto front [4]. In this study, callus formation was regarded as having a negative effect on micropropagation because of somaclonal variation and limitations of the vascular system. Therefore, the NSGAII algorithm was hybridized with the MLP model to find the hormonal concentrations that give the maximum SN and SL as well as the minimum CW. Hesami et al. [20] applied MLP-NSGAII to optimize different types and concentrations of disinfectants and immersion times to minimize in vitro contamination and maximize the viability of chrysanthemum explants. They reported high accuracy of the MLP-NSGAII model (R² > 0.94). In this work, we used the MLP-NSGAII model to predict and optimize the hormonal combinations for shoot proliferation of E. cheiri and to gain new insights into improving in vitro culture.
The high coefficients of determination (0.84, 0.99 and 0.93 for SN, SL, and CW, respectively) between observed and predicted values for both the training and testing processes showed that this method can be considered efficient for analyzing and predicting in vitro growth conditions for E. cheiri (Table 2). Moreover, the validation experiments confirmed the results predicted by this method (Table 5).
Both the type and the concentration of hormonal combinations play a critical role in plant proliferation; therefore, each plant species needs specific hormone concentrations according to its endogenous hormone content [22]. Auxins and cytokinins, as the major plant hormones, affect cell division and the multiplication of plant tissues [32]. Among them, cytokinins are believed to be the most important plant hormones responsible for the stimulation of cell division and shoot proliferation [48]. In some cases, combinations of cytokinins have proved more effective, leading to an increased proliferation rate [48]. This is in line with our results, which showed that increasing the concentrations of the two cytokinins BA and Kin up to 2 mg l⁻¹ and 1 mg l⁻¹, respectively, enhanced the shooting parameters (Table 1). Higher concentrations of the cytokinin combination reduced shoot production and increased callus formation. Nowakowska et al. [48] similarly reported that excessively high cytokinin concentrations in the medium inhibited shoot multiplication and negatively affected shoot length. Many other researchers have also reported similar results [49,50]. Auxins are the other vital group of plant hormones and trigger many plant activities, such as root and shoot production, stimulation of callus cell division, and induction of apical dominance [51]. Low contents of auxins along with high concentrations of cytokinins affect cell division and are responsible for in vitro regeneration and shoot proliferation [52,53]. Auxins are considered to exhibit synergistic, antagonistic and additive interactions with cytokinins in the regulation of physiological responses, depending on the plant species and tissue type [54]. In fact, cytokinin acts as a positive regulator of auxin biosynthesis and auxin as a negative regulator of cytokinin biosynthesis, and both are controlled by a homeostatic regulatory mechanism [55].
NAA is the only auxin that does not require active uptake to pass through the plasma membrane into cells, and it has a synergistic/additive effect on shoot proliferation [56]. In our experiment, the combined use of BA and NAA caused higher shooting than either used alone (Table 1). These results are in line with findings in other species such as the G × N15 rootstock [22], Daphne mezereum 'Alba' [48], Chinese ginger (Boesenbergia rotunda) [57], Cassia angustifolia [58], Magnolia sirindhorniae [59] and Santolina canescens [60]. However, high concentrations of hormonal combinations decrease shoot length and shooting rate and conversely increase callus development [22,61], which confirms our findings in E. cheiri (Table 1). Gibberellins are another important group of plant growth regulators; their most active and widely used member, gibberellic acid (GA3), plays important roles in plant development, seed germination, shoot elongation and flower induction [48,62,63]. Combinations of cytokinin and GA3 were found to be the best for both shoot multiplication and shoot elongation [59,64], and our results are in accordance with these findings (Table 1). In contrast, other reports have shown fewer shoots when BA was combined with GA3 than with BA alone [48,59,65,66]. Media not supplemented with cytokinins showed high GA3 absorption, suggesting that the presence of cytokinins could negatively affect GA3 uptake by the explant [67]. A high GA3 concentration in combination with a low BAP concentration was necessary for high shoot elongation in kiwifruit (Actinidia arguta) [63]. Moreover, a number of studies have shown that the combination of GA3 and NAA has a positive effect on in vitro shoot multiplication [6,56,68]. Optimization of the auxin concentration is regarded as a key factor in controlling plantlet height [45]. Zhang et al. [69] reported that the shoot length of potato explants increased when higher concentrations of IAA were used, and that the effect of IAA was improved by the addition of GA3 [69]. Furthermore, combinations of GA3 and NAA increased shoot length in a number of plants [70,71]. Several studies have shown that high callus production also occurs with increased concentrations of auxins [72,73]. In vitro plant regeneration is mainly dependent on exogenous and endogenous phytohormones [4,74]. Given the high callus production rate of E. cheiri, it is likely that, in addition to the exogenous growth regulators applied, the endogenous auxin content of this plant is also high. Based on the results of the sensitivity analysis (Table 4), BA was found to be superior to Kin in terms of the overall number and length of shoots produced per explant, which is in agreement with the findings of Akbas et al. [75]. Finally, according to the validation experiment, MLP-NSGAII, as a new computational algorithm for analyzing data derived from in vitro culture, was able to propose the optimal level of hormonal combinations to achieve the most appropriate results for the investigated parameters.

Conclusion
Plant tissue culture is a complex process in which many diverse factors are involved. To achieve an optimum protocol, several treatments with multiple replications and numerous rounds of trial and error are usually designed, which is very costly and time-consuming. Recently, computational techniques such as ANN models have been suggested for analyzing and optimizing multi-objective processes. In this study, MLP-NSGAII was implemented for the first time for E. cheiri as a new computational tool for predicting and optimizing in vitro shoot proliferation. Based on the results, MLP-NSGAII is an efficient method for modeling and optimizing the hormonal combination for plant in vitro culture. We anticipate that it may also be applicable to other plant species and to other factors such as mineral compounds, light and temperature. In future studies, MLP should be compared with other machine learning algorithms.